The errors in variables problem has long been known in statistics;
Adcock (1878) is perhaps the first reference which points out the problem. In
the simple bivariate regression model the result of errors in variables is a
downward bias (in magnitude) of the estimated regression coefficient: the "iron
law" of econometrics as known to MIT students. During the formative period of
econometrics in the 1930s, considerable attention was given to the errors in
variables problem. However, with the subsequent emphasis on aggregate time series
research the errors in variables problem decreased in importance to most
econometric research. In the past decade, as econometric research on micro data
has increased dramatically, the errors in variables problem has once again moved
to the forefront of econometric research.

Solutions  to  the  errors  in  variables  problem  for  the  linear  regression 
model have been well explored and are often used by econometricians. The most
common  solution  is  the  use  of  instrumental  variable  estimation  (IV)  which  depends 
on  the  existence  of  an  appropriate  instrument  or  repeated  observation  of  the 


MIT,  Princeton  University,  and  University  of  Wisconsin.  We  thank  Greg  Leonard 
for  excellent  research  assistance  and  the  National  Science  Foundation  for 
financial  support.  A.  Deaton,  A.  Lewbel,  R.  Pollak,  D.  Jorgenson  and  J.  Poterba 
made  helpful  suggestions.  Presented  as  the  Jacob  Marschak  Lecture  of  the 
Econometric  Society  at  the  1988  Australian  Economics  Congress. 
The notion of the "iron law" is that the estimated effect is (almost) never as
large as economic theory or the applied researcher expects it to be. Of course,
in the multiple regression situation with many right hand side variables the
result need no longer hold true. Nevertheless, the folklore in econometrics plus
years of reading students' econometrics papers results in the belief that downward
bias in coefficient estimates is a pervasive problem in micro data parameter estimates.

3
Griliches (1986) discusses micro data problems which lead to errors in
variables problems in many typical econometric data sets.


variable  measured  with  error.  Two  other  solutions  exist,  but  they  have  only 
infrequently been used by econometricians. The first alternative solution
involves  knowledge  or  an  estimate  of  the  variance  of  the  measurement  error(s)  of 
the  right  hand  side  variable(s)  or  its  relative  size  compared  to  the  variance  of 
the  stochastic  disturbance,  which  can  be  partly  or  entirely  composed  of  the 
measurement  error  in  the  left  hand  side  variable.  This  type  of  knowledge  is 
usually  not  available  to  econometricians.  The  second  alternative  solution  is  to 
use  distributional  properties  of  the  right  hand  side  variables  or  to  use  higher 
order  moments  which  depend  on  distributional  assumptions,  beyond  the  first  two 
moments,  to  estimate  the  parameters.  This  approach  again  has  only  rarely  been 
used  by  econometricians. 

Thus, the IV approach is by far the most widely used technique for
dealing  with  errors  in  variables  problems  in  linear  multiple  regression  problems. 
The  linear  model  with  measurement  error  is  isomorphic  to  a  linear  simultaneous 
equation  model,  so  that  two  stage  least  squares  or  a  closely  related  estimator  is 


Zellner  (1970)  and  Goldberger  (1972)  extend  the  single  equation  errors  in 
variables  problem  to  the  multiple  equation  context.  Geraci  (1977),  Hausman 
(1977)  and  Hsiao  (1976)  consider  the  errors  in  variables  problem  in  the 
simultaneous  equations  situation.  An  excellent  survey  is  given  by  Aigner,  Hsiao, 
Kapteyn,  and  Wansbeek  (1984). 

Reiersol  (1950)  demonstrated  identification  of  the  errors  in  variables  problem 
using  distributional  assumptions.  Kapteyn  and  Wansbeek  (1983)  generalize  the 
result  to  the  multiple  regression  context.  Bickel  and  Ritov  (1987)  apply  the 
method  in  an  adaptive  estimation  framework.  Geary  (1942)  originally  proposed 
using  higher  order  moments  in  estimation  in  the  errors  in  variables  problem. 
Higher  order  moments  may  offer  a  useful  methodology  given  the  increasingly  large 
data  sets  of  many  thousands  of  observations  used  by  econometricians.  They  can  be 
applied in a straightforward method of moments procedure. However, the technique
has  been  little  used  to  date  and  J.  Hausman  has  been  unsuccessful  in  a  few 
previous  attempts. 
used. However, this relationship no longer holds in the nonlinear regression
framework  as  recently  noted  by  Y.  Amemiya  (1985).  The  reason  that  2SLS  no  longer 
leads  to  a  consistent  estimator  in  the  nonlinear  errors  in  variables  problem  is 
because  the  error  of  measurement  is  no  longer  additively  separable  from  the  true 
variable  in  the  nonlinear  regression  model.  Application  of  2SLS  or  nonlinear 
2SLS  (N2SLS)  leads  to  inconsistent  estimates. 

A straightforward way in which to see why 2SLS or N2SLS does not yield
consistent estimators in the nonlinear errors in variables model is to consider
the linear in parameters and nonlinear in variables specification:

(1.1)  $y_i = \beta_0 + \beta_1 g(z_i) + \varepsilon_i, \quad i = 1, \ldots, n$

where g(z) is a function sufficiently smooth to admit Taylor approximations. As in
the linear errors in variables framework, we assume that $z_i$ is unobservable;
instead the observed variable $x_i$ takes the form:

(1.2)  $x_i = z_i + \eta_i, \quad i = 1, \ldots, n$

where $\eta_i$ is assumed to be uncorrelated with $z_i$. Replacing the unobservable $z_i$
with $x_i$ in equation (1.1) and taking a Taylor expansion leads to:


Fuller  (1987)  discusses  many  of  these  other  estimators  which  may  improve  the 
finite  sample  performance  of  IV-type  estimators  in  the  errors  in  variables  context. 

The failure of additive separability also arises in the reduced form of the
nonlinear simultaneous equations problem where the additive stochastic
disturbance of the structural form enters nonlinearly into the reduced form.
This situation leads to N2SLS and N3SLS being inefficient relative to ML in the
nonlinear simultaneous equations model as demonstrated by T. Amemiya (1977).
(1.3)  $y_i = \beta_0 + \beta_1 g(x_i) + \varepsilon_i - \beta_1 g^{[1]}(x_i)\,\eta_i + \beta_1 \sum_{j=2}^{\infty} g^{[j]}(x_i)\,(-\eta_i)^j / j!$


where $g^{[j]}$ denotes the jth derivative of g. Inspection of the first term of the
Taylor expansion in equation (1.3) demonstrates the fundamental problem with
N2SLS or other IV techniques. The instrument must be correlated with $g(x_i)$, but
be uncorrelated with $\eta_i$ and $\varepsilon_i$. However, the first term of the Taylor expansion
contains both $\eta_i$ and the derivative of $g(x_i)$. In the linear errors in variables
framework the first derivative of $g(x_i)$ is unity so the observation error is
linearly separable from the right hand side variable. This linear separation is
not present in equation (1.3) so that, even if the higher order terms of the
Taylor expansion were absent, it is unlikely that an appropriate instrumental
variable would exist. Moreover, the additional terms in the
Taylor expansion beyond the first make the problem even more unwieldy to solve
with the usual instrumental variable techniques.
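
The failure of the usual IV logic can be illustrated numerically. The following is a minimal sketch and is not taken from the paper: it assumes $g(z) = e^z$, normally distributed measurement errors, and a second measurement $w_i = z_i + v_i$ whose transformation $e^{w_i}$ is used as the instrument for $e^{x_i}$. All names and parameter values are illustrative. Under these assumptions the IV slope converges to $\beta_1 / E[e^{\eta_i}] = \beta_1 e^{-\sigma_\eta^2/2}$ rather than to $\beta_1$, because $e^{x_i} = e^{z_i} e^{\eta_i}$ is not additively separable in $\eta_i$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
beta0, beta1 = 1.0, 1.0          # illustrative true parameters
sigma_eta, sigma_v = 0.5, 0.2    # illustrative measurement error std. deviations

z = rng.uniform(0.0, 1.0, n)           # true (unobserved) regressor
eta = rng.normal(0.0, sigma_eta, n)    # error in the first measurement
v = rng.normal(0.0, sigma_v, n)        # error in the second measurement
eps = rng.normal(0.0, 0.5, n)          # regression disturbance

y = beta0 + beta1 * np.exp(z) + eps    # nonlinear-in-variables model, g(z) = exp(z)
x = z + eta                            # observed regressor
w = z + v                              # second measurement, basis of the instrument


def cov(a, b):
    """Sample covariance of two arrays."""
    return np.mean((a - a.mean()) * (b - b.mean()))


# Simple IV slope: instrument exp(w) for the mismeasured regressor exp(x).
iv_slope = cov(np.exp(w), y) / cov(np.exp(w), np.exp(x))

# Probability limit under the normality assumption on eta.
iv_limit = beta1 * np.exp(-sigma_eta ** 2 / 2)

print(f"IV slope: {iv_slope:.4f}  limit: {iv_limit:.4f}  true beta1: {beta1}")
```

In this design the IV estimate settles near the attenuated limit and stays away from the true coefficient, even though the instrument is independent of both measurement errors.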

To date the methods proposed to estimate the nonlinear errors in
variables model depend on very strong restrictions on the distribution of the
measurement errors of the unknown regression coefficients. However, knowledge of
the parametric form of the distribution function of the measurement errors is not
sufficient for consistent estimation. An additional assumption is needed: that
the true values of the regressors, e.g. the $z_i$ in equation (1.1), are also
random drawings from a distribution with a known parametric form.
If instead the true regressors are treated as fixed but unknown constants, then
maximum likelihood estimation is inconsistent due to the "incidental parameters"
problem of Neyman and Scott (1948).


8
Aigner et al. (1984) have a brief discussion on ML for nonlinear errors in
variables models.
An alternative approach assumes that a large number of measurements on
each true regressor exist, so that the average of these measurements closely
approximates the true regressors. Consistent estimation of nonlinear errors in
variables models then follows because the covariance matrix of the measurement
errors for the regressors approaches zero as the sample size increases.
Estimators under this type of assumption have been proposed by Villegas (1969),
Dolby and Lipton (1972), Wolter and Fuller (1982b), Powell and Stoker (1984), and
Y. Amemiya (1985). This situation seems unlikely to occur very often in
econometrics.

Lastly,  Griliches  and  Ringstad  (1970)  analyzed  a  quadratic 
specification  and  demonstrated  that  the  bias  of  least  squares  can  be  exacerbated 
by the nonlinearity. Wolter and Fuller (1982a) propose a consistent estimator
for  the  quadratic  specification  so  long  as  the  errors  are  normally  distributed. 
Neither  instrumental  variables  nor  additional  measurements  are  required  for 
estimation. 

In  this  paper  we  discuss  consistent  estimators  for  nonlinear  regression 
specifications  when  errors  in  variables  are  present.  Our  estimators  depend  on 
the  existence  of  instrumental  variables  or  a  single  repeated  observation.  Thus 
we  do  not  require  the  large  number  of  measurements  or  shrinking  covariance  matrix 
assumption  of  much  previous  research. 

In  Section  2  we  discuss  an  estimator  for  polynomial  specifications  in 
the true regressors. This estimator, proposed by Hausman, Ichimura, Newey, and
Powell (HINP) (1986), leads to consistent and asymptotically normal estimators so
long as either instrumental variables or an additional measurement of each true
regressor is present. An interesting result emerges in the instrumental
variable case because the model turns out to be overidentified. Thus, tests of
the model specification are possible.

In  Section  3  we  discuss  an  estimator  for  the  general  nonlinear 
specification when errors in variables are present. This estimator, proposed by
Hausman, Newey, and Powell (HNP) (1988), yields a consistent estimator when an
additional  measurement  of  each  true  regressor  is  present.  To  date,  we  have  not 
been  able  to  establish  asymptotic  normality  of  the  estimator  or  to  extend  it  to 
the  instrumental  variable  situation.  However,  Monte  Carlo  evidence  provides  some 
indication  that  the  distribution  of  the  estimator  is  not  badly  behaved  so  that  we 
use  bootstrap  estimates  of  the  precision  of  our  estimates. 

In Section 4 we apply our methodology to estimation of Engel curves on
household data, a problem on which econometricians have done considerable previous
research. Here we find a number of interesting results. First, we find that
the  "Leser-Working"  specification  of  budget  shares  regressed  on  the  log  of  income 
or  expenditure  should  be  generalized  to  higher  order  terms  in  log  income.  Also, 
we  find  that  errors  in  variables  in  either  reported  income  or  expenditure  should 
be  accounted  for.  However,  we  do  not  find  evidence  that  more  general  functional 
forms  beyond  polynomial  specifications  in  income  improve  estimation  of  the  Engel 
curve  significantly.  Lastly,  and  perhaps  most  interesting,  we  find  rather  strong 
support  for  the  Gorman  (1981)  rank  restriction  on  the  matrix  of  coefficients  for 
the  polynomial  terms  in  income.  Thus,  after  over  100  years  of  Engel  curve 
analysis,  a  restriction  from  economic  theory  may  affect  the  econometric 
estimation.  This  result  is  remarkable  if  future  research  leads  to  similar 
findings.
II.   Identification and Estimation of the Polynomial Functional Model

We first consider estimation of the parameters of the polynomial
specification

(2.1)  $y_i = \sum_{j=0}^{K} \beta_j z_i^j + \varepsilon_i, \quad i = 1, \ldots, n$

which is a Kth order polynomial in the unobservable variable $z_i$. We will treat
$z_i$ as a random variable with unknown distribution function; alternatively, the
$\{z_i\}$ can be interpreted as a sequence of fixed constants with appropriate
modifications in the regularity conditions. The observed variable $x_i$ has the
same relationship to the unobserved variable $z_i$ as in Section I:

(2.2)  $x_i = z_i + \eta_i, \quad i = 1, \ldots, n.$

Our first estimator uses the additional information of a single repeated
measurement $w_i$ of $z_i$ with an additional measurement error $v_i$ defined by

(2.3)  $w_i = z_i + v_i, \quad i = 1, \ldots, n.$

Two points of interest arise from the specification and assumptions of
equation (2.3). First, we will assume that $v_i$ is uncorrelated with $\varepsilon_i$ and $\eta_i$ and
is independent of $z_i$. The independence assumption is required by the
nonlinearity. These assumptions are analogous to the linear case so that $w_i$
could be used as an instrumental variable if the specification of equation (2.1)
were linear. Second, we will not impose the usual restriction $E(v_i) = 0$ so that
a constant term can be present in the measurement equation (2.3). Alternatively,
$v_i$ can be assumed to have zero mean, but the slope coefficient of $z_i$ in equation
(2.3) can then be non-unity. The details of the derivation of the estimator in
this latter case are left to the interested reader.

We now turn to sufficient regularity conditions to allow identification
and estimation of the $\beta_j$ in equation (2.1). Define the matrix norm
$\|A\| = \max_{i,j} |a_{ij}|$. We make

Assumption 1: The random variables $\varepsilon_i$, $\eta_i$, $v_i$, and $z_i$ are jointly
i.i.d. with

(i)    $E(\varepsilon_i \mid z_i, v_i) = E(\eta_i \mid z_i, v_i) = 0$
(ii)   $v_i$ is independent of $z_i$
(iii)  $E\,\|(\varepsilon_i, \eta_i, v_i^{2K}, z_i^{2K})\|^2 < \infty$
(iv)   All necessary moment matrices are nonsingular.

The i.i.d. assumption can be relaxed to allow for either dependence or
heterogeneity or both. Assumptions A.1(i), (iii), and (iv) are standard
assumptions to allow derivation of both identification and the asymptotic
distribution of the estimator. Only assumption A.1(ii) is stronger than in the
usual linear case. Independence is necessary because of the nonlinear
specification. Note that assumption A.1(i) could be strengthened to independence
for purposes of symmetry; we require only the weaker no correlation assumption.

For identification we consider the population analogue of the normal
equations. Define the moments $\xi_p = E[y_i z_i^p]$ for $p = 0, \ldots, K$ and $\phi_m = E[z_i^m]$
for $m = 0, \ldots, 2K$. The normal equations $z'y = (z'z)\beta$ take the form

(2.4)  $\xi_p = \sum_{j=0}^{K} \beta_j \phi_{j+p}, \quad p = 0, \ldots, K.$
Both sides of equation (2.4) depend on the unobservable variables $z_i$. However,
the unobservable moments can be derived from the observable moments $E[x_i w_i^p]$,
$E[w_i^p]$, and $E[y_i w_i^p]$. We now use assumption A.1 and the fact that
$E[w_i^0] = E[z_i^0] = E[v_i^0] = 1$ to find:

(2.5)  $E[x_i w_i^{j-1}] = \sum_{p=0}^{j-1} \binom{j-1}{p} \phi_{p+1} \nu_{j-1-p}$  for $j = 1, \ldots, 2K$, where $\nu_j = E[v_i^j]$,

(2.6)  $E[w_i^j] = \sum_{p=0}^{j} \binom{j}{p} \phi_p \nu_{j-p}$  for $j = 1, \ldots, 2K$,

(2.7)  $E[y_i w_i^j] = \sum_{p=0}^{j} \binom{j}{p} \xi_p \nu_{j-p}$  for $j = 0, \ldots, K$.

Equations (2.5) and (2.6) allow identification of z'z, and equation (2.7) then
allows identification of z'y. That is, equations (2.5)-(2.7) define (5K + 1)
equations which have a one-to-one relationship between the moments of the
observable variables and the (5K + 1) elements of the unobservable moment vectors
$\phi$ and $\nu$, each of which has 2K elements, and $\xi$, which has K + 1 elements. HINP (1986)
derive recursive relationships which permit convenient solution of the elements
of the parameter vector $\theta = (\phi', \nu', \xi')$. Once $\theta$ is computed, $\beta$ is then
identifiable as a solution to the normal equations (2.4).
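
The triangular structure of equations (2.5)-(2.7) can be sketched in code. The recursion below is one direct way of solving the system and is not necessarily the set of formulae given in HINP (1986); the function name `recover_beta` and the test design are hypothetical. To check the algebra without sampling noise, the example uses a full product grid of values for $(z, v, \varepsilon, \eta)$, so that the independence and zero-mean conditions of Assumption 1 hold exactly in the "sample" and $\beta$ should be recovered up to rounding error.

```python
import numpy as np
from math import comb


def recover_beta(x, w, y, K):
    """Solve (2.5)-(2.7) recursively, then the normal equations (2.4)."""
    # Observable moments.
    a = [np.mean(x * w ** (j - 1)) for j in range(1, 2 * K + 1)]  # a[j-1] = E[x w^(j-1)]
    b = [np.mean(w ** j) for j in range(2 * K + 1)]               # b[j]   = E[w^j]
    c = [np.mean(y * w ** j) for j in range(K + 1)]               # c[j]   = E[y w^j]

    phi = np.zeros(2 * K + 1)  # phi[m] = E[z^m]
    nu = np.zeros(2 * K + 1)   # nu[j]  = E[v^j]
    phi[0] = nu[0] = 1.0
    for j in range(1, 2 * K + 1):
        # (2.5): the p = j-1 term is phi_j * nu_0; all other terms are known.
        phi[j] = a[j - 1] - sum(comb(j - 1, p) * phi[p + 1] * nu[j - 1 - p]
                                for p in range(j - 1))
        # (2.6): the p = 0 term is nu_j; the other terms use phi_1, ..., phi_j.
        nu[j] = b[j] - sum(comb(j, p) * phi[p] * nu[j - p]
                           for p in range(1, j + 1))

    xi = np.zeros(K + 1)  # xi[p] = E[y z^p]
    for j in range(K + 1):
        # (2.7): the p = j term is xi_j * nu_0.
        xi[j] = c[j] - sum(comb(j, p) * xi[p] * nu[j - p] for p in range(j))

    # Normal equations (2.4): D[p, j] = phi_{j+p}.
    D = np.array([[phi[p + j] for j in range(K + 1)] for p in range(K + 1)])
    return np.linalg.solve(D, xi), phi, nu, xi


# Exact check on a product grid (illustrative values; v need not have mean zero).
zg, vg, eg, ng = np.meshgrid([0.0, 1.0, 2.0, 3.0], [-1.0, 0.5, 2.0],
                             [-0.5, 0.5], [-1.0, 1.0], indexing="ij")
z, v, eps, eta = (g.ravel() for g in (zg, vg, eg, ng))

beta_true = np.array([1.0, -2.0, 0.5])   # quadratic specification, K = 2
y = beta_true[0] + beta_true[1] * z + beta_true[2] * z ** 2 + eps
x = z + eta
w = z + v

beta_hat, phi_hat, nu_hat, xi_hat = recover_beta(x, w, y, K=2)
print(beta_hat)
```

With real data the same function would be applied to the sample moments, and the recovered $\hat\beta$ would then carry sampling error from the estimated higher order moments.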

To derive the asymptotic distribution of the estimator define the (5K +
1) dimensional data vector

(2.8)  $m_i = [x_i, \ldots, x_i w_i^{2K-1}, w_i, \ldots, w_i^{2K}, y_i, \ldots, y_i w_i^K]$.

Define the population moments to be $\mu = E[m_i]$. The moment vector $\mu$ is
consistently estimated by the sample average moment vector $\bar m$, and application of
the Lindeberg-Levy CLT yields the asymptotic distribution of $\bar m$:

(2.9)  $\sqrt{n}\,(\bar m - \mu) \xrightarrow{d} N(0, \Omega)$  for  $\Omega = E[m_i m_i'] - \mu\mu'$.

The elements of $\theta$ are then estimated by the continuous and continuously
differentiable relationship $\theta = h(\mu)$. First order approximations, also known as
the delta method, lead to the asymptotic distribution of $\hat\theta$:

(2.10)  $\sqrt{n}\,(\hat\theta - \theta) \xrightarrow{d} N(0, H \Omega H')$  for  $H = \partial h(\mu) / \partial \mu'$.

The elements of the Jacobian matrix H can be calculated recursively with
computational details given in HINP (1986).
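
The delta method step in (2.10) can be sketched generically. The code below is an illustration only: it replaces the recursive analytic Jacobian of HINP (1986) with a central finite difference approximation, and the mapping `h`, the moment vector, and the matrix `omega` are made-up examples. Here $h$ maps the first two moments of a variable to its variance, so that $H = (-2\mu_1, 1)$ can be checked by hand.

```python
import numpy as np


def numerical_jacobian(h, mu, step=1e-6):
    """Central finite difference approximation to H = dh(mu)/dmu'."""
    mu = np.asarray(mu, dtype=float)
    h0 = np.atleast_1d(h(mu))
    H = np.empty((h0.size, mu.size))
    for k in range(mu.size):
        d = np.zeros_like(mu)
        d[k] = step
        H[:, k] = (np.atleast_1d(h(mu + d)) - np.atleast_1d(h(mu - d))) / (2 * step)
    return H


def delta_method_cov(h, mu, omega):
    """Asymptotic covariance H Omega H' of h(m_bar), as in (2.10)."""
    H = numerical_jacobian(h, mu)
    return H @ omega @ H.T


def h(mu):
    # Illustrative mapping: variance from the first two raw moments.
    return np.array([mu[1] - mu[0] ** 2])


mu = np.array([1.0, 3.0])                    # illustrative (E[w], E[w^2])
omega = np.array([[2.0, 1.0], [1.0, 5.0]])   # illustrative covariance of the moments

avar = delta_method_cov(h, mu, omega)
print(avar)   # analytic value: H = (-2, 1), so H Omega H' = 9
```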

Lastly, we solve for $\beta$ using the normal equations (2.4) and the
estimated $\hat\theta$. Let D be the second moment matrix of $(1, z_i, \ldots, z_i^K)$ and
$\hat D = D(\hat\theta)$ based on the estimated $\hat\theta$. We estimate $\beta$ by

(2.11)  $\hat\beta = \hat D^{-1} \hat\xi$.

Define $S_\xi$ as the selection matrix which gives $S_\xi \theta = \xi$ and $S_\phi$ as the selection
matrix which gives $S_\phi \theta = \mathrm{vec}(D)$. HINP (1986) derive the asymptotic distribution

(2.12)  $\sqrt{n}\,(\hat\beta - \beta) \xrightarrow{d} N(0, V)$  where

        $V = D^{-1} [S_\xi - (\beta' \otimes I_{K+1}) S_\phi]\, H \Omega H'\, [S_\xi - (\beta' \otimes I_{K+1}) S_\phi]' D^{-1}$.

We have demonstrated identification and derived a consistent and asymptotically
normal estimator in the case of a single replicated measurement for the
unobservable variable $z_i$.

We now turn to identification and estimation when instrumental
variables which allow prediction of the unobserved regressor $z_i$ are available.
Thus, we assume that $z_i$ is determined by the p dimensional vector of instruments

(2.13)  $z_i = q_i \alpha + v_i, \quad i = 1, \ldots, n.$

To state sufficient conditions for identification and estimation we change
assumption A.1(ii) to an assumption that $v_i$ and the instruments $q_i$ are
independent. Otherwise the sufficient conditions are quite similar to the
previous case that we considered:

Assumption 2: The random variables $\varepsilon_i$, $\eta_i$, $v_i$, and $q_i$ are jointly
i.i.d. with

(i)    $E(\varepsilon_i \mid q_i, v_i) = E(\eta_i \mid q_i, v_i) = 0$,  $E(\varepsilon_i \eta_i \mid q_i, v_i) = \sigma_{\varepsilon\eta}$
(ii)   $v_i$ is independent of $q_i$ with $E[v_i] = 0$
(iii)  $E[\|(\varepsilon_i, \eta_i)\|^2] < \infty$,  $E[\|(v_i, q_i)\|^{2(K+1)}] < \infty$
(iv)   All necessary moment matrices are nonsingular.

Again, the i.i.d. assumption can be relaxed to more general situations.

For purposes of identification we assume that $\alpha$ is known since it is
identified from

(2.14)  $x_i = q_i \alpha + \eta_i + v_i, \quad i = 1, \ldots, n.$




Let $w_i = q_i \alpha$ and again denote $\nu_j = E[v_i^j]$. We must again determine the $\nu_j$ for
identification and estimation.

Substitution of the instrumental variables into equation (2.1) yields:

(2.15)  (i)    $y_i = \sum_{j=0}^{K} \gamma_j w_i^j + e_i$  where

        (ii)   $\gamma_j = \sum_{p=j}^{K} \binom{p}{j} \beta_p \nu_{p-j}, \quad j = 0, \ldots, K$

        (iii)  $e_i = \varepsilon_i + \sum_{j=0}^{K} \sum_{p=j}^{K} \binom{p}{j} \beta_p [v_i^{p-j} - \nu_{p-j}]\, w_i^j$

Equation (2.15) (iii) implies that $E(e_i \mid w_i) = 0$ so that $\gamma$ is identified by the
least squares projection of equation (2.15) (i).

We now have the convolution of $\beta$ and $\nu$ in $\gamma$ which must be sorted out
for identification. Before proceeding to do so, note that since $\nu_0 = 1$ and $\nu_1 = 0$,
we have $\gamma_K = \beta_K$ and $\gamma_{K-1} = \beta_{K-1}$. Thus, the two highest elements of $\beta$ are
identified from equation (2.15) (i) alone which will subsequently lead to
overidentification.
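
The mapping in (2.15) (ii) is easy to verify numerically. The sketch below is illustrative and uses hypothetical names and values: it computes $\gamma$ from $\beta$ and the moments $\nu_j$ of $v_i$, then compares the result with a least squares regression of $y_i$ on powers of $w_i$. The data form a full product grid of values for $(w, v, \varepsilon)$, so $v$ has exactly zero mean and is exactly independent of $w$, and the regression coefficients should match $\gamma$ up to rounding error.

```python
import numpy as np
from math import comb


def gamma_from_beta(beta, nu):
    """Reduced form coefficients of (2.15)(ii): sum_{p>=j} C(p, j) beta_p nu_{p-j}."""
    K = len(beta) - 1
    return np.array([
        sum(comb(p, j) * beta[p] * nu[p - j] for p in range(j, K + 1))
        for j in range(K + 1)
    ])


K = 3
beta = np.array([1.0, -2.0, 0.5, 0.25])   # illustrative cubic specification

# Product grid: v takes the values (-2, 1, 1), so E[v] = 0 and v is skewed.
wg, vg, eg = np.meshgrid([-1.0, 0.0, 1.0, 2.0, 3.0], [-2.0, 1.0, 1.0],
                         [-0.5, 0.5], indexing="ij")
w, v, eps = wg.ravel(), vg.ravel(), eg.ravel()

z = w + v                                             # z_i = w_i + v_i
y = np.polynomial.polynomial.polyval(z, beta) + eps   # equation (2.1)

nu = np.array([np.mean(v ** j) for j in range(K + 1)])   # nu_0, ..., nu_K
gamma = gamma_from_beta(beta, nu)

# Least squares projection of y on (1, w, ..., w^K), as in (2.15)(i).
W = np.vander(w, K + 1, increasing=True)
gamma_ls = np.linalg.lstsq(W, y, rcond=None)[0]

print(gamma, gamma_ls)
```

The two highest order coefficients coincide with $\beta_K$ and $\beta_{K-1}$, while the lower order ones absorb terms in $\nu_2$ and $\nu_3$.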

To complete the identification, we now multiply through equation (2.1)
by the observable variable $x_i$ and we substitute $z_i = w_i + v_i$:

(2.16)  (i)    $x_i y_i = \sum_{j=0}^{K+1} \delta_j w_i^j + u_i$  where

        (ii)   $\delta_j = \sum_{p=j}^{K+1} \binom{p}{j} \beta_{p-1} \nu_{p-j}, \quad j = 1, \ldots, K+1$

        (iii)  $u_i = \sum_{j=0}^{K+1} \sum_{p=j}^{K+1} \binom{p}{j} \beta_{p-1} [v_i^{p-j} - \nu_{p-j}]\, w_i^j + [\eta_i y_i - \sigma_{\varepsilon\eta}] + z_i \varepsilon_i, \quad i = 1, \ldots, n.$

Again, the disturbance term in equation (2.16) (i) has zero conditional
expectation so that $E[u_i \mid w_i, v_i] = 0$. The estimate of $\delta$ follows from the least
squares projection of equation (2.16) (i). We can again identify the two highest
order terms of $\beta$ from $\delta_{K+1} = \beta_K$ and $\delta_K = \beta_{K-1}$. Overidentification of these two
parameters follows from the $\gamma$ and $\delta$ coefficients.

Thus, we have (2K + 3) reduced form coefficients $\gamma_0, \ldots, \gamma_K$,
$\delta_0, \ldots, \delta_{K+1}$. We have K+1 unknown $\beta$ coefficients and K unknown nuisance
parameters in $\nu$. Thus, we have overidentification of order 2, or equivalently,
we can discard two equations and still identify the unknown parameters. The
solution to the equations once again follows a convenient recursion relationship.
HINP (1986) give the recursion formulae.

Estimation proceeds from initial estimation of the reduced form
parameters $\gamma$ and $\delta$. Given $\hat\gamma$ and $\hat\delta$, we then estimate $\beta$ and $\nu$. This two step
procedure need not be asymptotically efficient; the topic is left to future
research. The derivation of the asymptotic distribution of the resulting
estimator is straightforward, but tedious. We give only a sketch here and direct
the interested reader to HINP (1986) who give the complete derivation. Taking
account of the fact that $\alpha$ must be estimated, the asymptotic distribution of the
reduced form parameter is normal, say

(2.17)  $\sqrt{n} \left[ \begin{pmatrix} \hat\gamma \\ \hat\delta \end{pmatrix} - \begin{pmatrix} \gamma \\ \delta \end{pmatrix} \right] \xrightarrow{d} N(0, M).$
Given equation (2.17) we can then obtain efficient estimates of the $\beta$'s by
minimum chi-square estimation. Denote the (2K + 2) vector of reduced form
coefficients, after elimination of $\delta_0$, by

(2.18)  $\hat\Pi = (\hat\gamma, \hat\delta) = (\hat\Pi_1, \hat\Pi_2)$  where  $\hat\Pi_1 = (\hat\gamma_K, \hat\gamma_{K-1}, \hat\delta_{K+1}, \hat\delta_K)'$.

Similarly, denote the 2K vector of $\beta$ and $\nu$ parameters as

(2.19)  $\theta = (\beta, \nu) = (\theta_1, \theta_2)$  where  $\theta_1 = (\beta_K, \beta_{K-1})$.

The unknown $\theta$ parameters follow from the reduced form $\Pi$ parameters

(2.20)  $\Pi = h(\theta)$

so rearrange the covariance matrix M from equation (2.17) to conform to equation
(2.20) and denote a consistent estimate of the rearranged matrix by $\hat V$.
The minimum chi square estimator is then

(2.21)  $Q = \min_{\theta}\, [\hat\Pi - h(\theta)]'\, \hat V^{-1}\, [\hat\Pi - h(\theta)]$.

The value of Q from equation (2.21) can be used to test the overidentifying
restrictions since it is distributed as a central chi square random variable with
2 degrees of freedom if the specification is correct. The test of identification
is equivalent to a test of equation (2.2): a non-zero mean and non-unity slope
coefficient of $z_i$ cause the system to be just identified. The asymptotic
covariance matrix of the minimum chi square estimator takes the usual form

(2.22)  $\sqrt{n}\,(\hat\theta - \theta) \xrightarrow{d} N(0, (H' V^{-1} H)^{-1})$  where  $H = \partial h(\theta) / \partial \theta'$.
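
A small numerical sketch may help fix ideas for the minimum chi square step. It is an illustration under stated assumptions rather than the HINP (1986) procedure: it takes K = 2, writes out $h(\theta)$ from (2.15) (ii) and (2.16) with $\nu_0 = 1$ and $\nu_1 = 0$, uses an identity matrix as a placeholder for $\hat V$, ignores any scaling by the sample size, and minimizes (2.21) by Gauss-Newton iterations with a finite difference Jacobian. All function names and parameter values are hypothetical.

```python
import numpy as np


def h(theta):
    """Reduced form (gamma_0, gamma_1, gamma_2, delta_1, delta_2, delta_3) for K = 2."""
    b0, b1, b2, nu2 = theta
    return np.array([b0 + b2 * nu2, b1, b2, b0 + 3.0 * b2 * nu2, b1, b2])


def jacobian(func, theta, step=1e-6):
    """Central finite difference approximation to dh(theta)/dtheta'."""
    base = func(theta)
    H = np.empty((base.size, theta.size))
    for k in range(theta.size):
        d = np.zeros_like(theta)
        d[k] = step
        H[:, k] = (func(theta + d) - func(theta - d)) / (2 * step)
    return H


def min_chi_square(pi_hat, V, theta_start, iters=50):
    """Gauss-Newton minimization of [pi_hat - h(theta)]' V^{-1} [pi_hat - h(theta)]."""
    theta = np.array(theta_start, dtype=float)
    V_inv = np.linalg.inv(V)
    for _ in range(iters):
        H = jacobian(h, theta)
        r = pi_hat - h(theta)
        theta = theta + np.linalg.solve(H.T @ V_inv @ H, H.T @ V_inv @ r)
    r = pi_hat - h(theta)
    return theta, float(r @ V_inv @ r)


theta0 = np.array([1.0, -2.0, 0.5, 2.0])   # illustrative (beta_0, beta_1, beta_2, nu_2)
V = np.eye(6)                              # placeholder for the estimated covariance

# Case 1: reduced form coefficients satisfy the restrictions exactly.
theta_hat, Q = min_chi_square(h(theta0), V, theta0 + 0.1)

# Case 2: delta_2 disagrees with gamma_1, violating an overidentifying restriction.
pi_bad = h(theta0)
pi_bad[4] += 0.2
theta_bad, Q_bad = min_chi_square(pi_bad, V, theta0 + 0.1)

print(theta_hat, Q, Q_bad)
```

In the second case the estimator splits the difference between the two conflicting estimates of $\beta_1$ and the criterion value is strictly positive, which is the quantity the overidentification test is based on.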


The specification of equation (2.1) does not contain other variables
besides the polynomial terms. However, in many applications we might expect the
appropriate specification to be

(2.23)  $y_i = \sum_{j=0}^{K} \beta_j z_i^j + R_i + \varepsilon_i, \quad i = 1, \ldots, n$

where we assume that the $R_i$ are measured without error. The usual partialing out
technique does not work for equation (2.23) because of the nonlinear errors in
variables specification. Instead, we apply two different approaches for the
replicated measurement and instrumental variables setups. The additional $R_i$
variables are accounted for in the replicated measurement case by considering
equation (2.4), the normal equations. We need to augment z'y and z'z to include
R. Thus we need to form the matrices R'z and R'y. The latter matrix is directly
observable. The matrix R'z depends on the unobservable variable z; however,


Because $\theta_2$ is just identified, equation (2.21) may be further simplified using
partitioned inverses. $\theta_2$ follows from $\Pi_2$, while the overidentified parameters $\theta_1$
are estimated from $\Pi_1$. See HINP (1986) for computational details.
equation (2.5) with $R_i$ included permits computation of an estimate of the
required moment matrix.

Inclusion of additional regressors in the instrumental variables case
is also quite straightforward. The addition of $R_i$ to equation (2.15) (i) has no
effect on the disturbance $e_i$ so that $\gamma_j$ in equation (2.15) (ii) remains the same.
Similarly, $\delta_j$ is estimated from equation (2.16) (i) after the term $w_i R_i$ is added
to the right hand side of the equation. In both cases least squares projections
yield consistent estimates of $\gamma_j$ and $\delta_j$ so that estimates of $\beta$ and $\nu$ can be
calculated from the estimated reduced form parameters.

In  this  section  we  have  proven  identification  and  developed  consistent 
estimators  for  the  polynomial  specification  when  either  a  single  replicated 
measurement  or  instrumental  variables  are  present.  Additional  replicated 
measurements  can  be  included  easily;  a  minimum  chi  square  combination  of  the 
estimated  parameters  offers  a  convenient  approach.  Similarly,  a  replicated 
measurement  and  instrumental  variable  situation  can  be  combined  using  a  minimum 
chi  square  approach.  In  both  cases  we  increase  the  efficiency  of  our  estimator, 
or  alternatively,  we  can  test  the  specification  of  our  model.  However,  we  do  not 
claim  to  have  found  the  most  efficient  estimator  since  there  exists  an  infinite 
class  of  unconditional  moment  restrictions  which  can,  in  principle,  be  used  in 
estimation.  We  leave  the  construction  of  feasible  efficient  estimators  which 
attain  an  efficiency  bound  as  a  topic  for  future  research. 


This approach is equivalent to using the transformed replicated measurement
$\tilde w = (I - R(R'R)^{-1}R')w$ together with equations (2.4)-(2.7).
III.   Errors in Variables in a General Nonlinear Specification

We now discuss identification and consistent estimation of a general
nonlinear errors in variables specification, following
Hausman, Newey, and Powell (1988). Our estimator is limited to the replicated
measurement  case;  we  have  been  unable  to  extend  estimation  as  yet  to  the 
instrumental  variables  case.  Also,  we  do  not  currently  have  an  asymptotically 
normal  distribution  for  the  estimator.  We  lack  the  centering  correction  for  the 
estimator  which  would  permit  derivation  of  the  asymptotic  distribution.  However, 
limited  Monte  Carlo  investigations  indicate  that  the  bias  of  the  estimator  is 
small  and  that  bootstrap  estimates  of  the  sampling  distribution  of  the  estimator 
provide  a  good  indication  of  the  precision  of  the  estimates. 

We consider the general nonlinear regression model with additive
disturbance:

(3.1)  $y_i = f(z_i, \beta) + \varepsilon_i, \quad i = 1, \ldots, n$

where we take z to be a scalar. Inclusion of additional variables measured
without error is straightforward using the approach of the last section. The
variable $z_i$ is unobservable; instead we observe $x_i$ as in equation (2.2). The
replicated measurement $w_i$ is determined similarly by equation (2.3). Lastly, we
make assumption A.1 (i)-(ii) of Section II where the assumption that $v_i$ is
independent of $z_i$ is crucial for our estimator. The moment assumptions that we
make are

Assumption 3: $E[|\varepsilon_i|^2]$ and $E[|f(z_i, \beta)|^2]$ are finite and there exists
$\tau > 0$ such that $E[\exp\{\tau\,|(z_i, \eta_i, v_i)|\}]$ is finite.




Using these assumptions and the results of Section II, we can estimate the
coefficients of the linear projection of $y_i$ on polynomials of the true, but
unobservable, regressors.

We denote this polynomial approximation using the estimated moments for
the normal equations (2.4) as

(3.2)  $\hat\xi_p = \sum_{j=0}^{K} \hat\Pi_{jK}\, \hat\phi_{j+p}, \quad p = 0, \ldots, K$

where $\Pi(K)$ stands for a Kth order polynomial. Once we have the estimated $\hat\Pi(K)$
projection coefficients we can estimate the true regression function
$f_0(z) = f(z, \beta)$ by

(3.3)  $\hat f_K(z) = \sum_{j=0}^{K} \hat\Pi_{jK}\, z^j$

which is a nonparametric estimate of the regression function. Equation (3.3)
provides an estimate of the true least squares projection

(3.4)  $f_K(z) = \sum_{j=0}^{K} \Pi_{jK}\, z^j$

of $y_i$ on $z_i(K)$ and also provides an estimate of the least squares projection of
$f_0(z_i)$ on $z_i(K)$ because $\varepsilon_i$ has zero mean conditional on $z_i$ by Assumption A.1 (i).
Furthermore, existence of the moment generating function of $z_i$, given assumption
A.3, is sufficient for polynomials to form a basis for the space of square
integrable functions, which implies that

(3.5)  $\lim_{K \to \infty} E[(f_0(z_i) - f_K(z_i))^2] = 0.$
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For  K  large  enough  fj^(z^)   provides  an  arbirarily  good  mean  square  approximation 

of  f0(zi>. 

Given the nonparametric estimate of the true regression function,
$\hat{f}_K(z)$, we estimate $\beta$ by imposing the restrictions implied by the parametric model
on $\hat{f}_K(z)$. We use a minimum distance approach and obtain $\hat{\beta}$ from

(3.6)   $\hat{\beta} = \arg\min_{\beta \in B} \sum_{i=1}^{n} \omega(x_i)\, [\hat{f}_K(x_i) - f(x_i, \beta)]^2$

where $\omega(x_i)$ is a nonnegative weight function and B is a feasible set of parameter
values. Note that the observed variable $x_i$ is used in equation (3.6) in place of
the unobserved variable $z_i$. Other observed variables could also be used, which
could lead to more efficient estimators. We restrict our attention here to
$x_i$, with an analysis of other variables left to future research. The purpose of
the weight function $\omega(x_i)$ is to take account of the substitution of $x_i$ for the
unobservable $z_i$ so that the mean square approximation holds:


(3.7)   $\lim_{K \to \infty} E[\omega(x_i)\, \{f_0(x_i) - f_K(x_i)\}^2] = 0.$


Also, we require regularity conditions and an identification assumption to
proceed.

Assumption 4: (i) B is compact. (ii) $f(x_i, \beta)$ is continuous at each $\beta$ in B
with probability one. (iii) There exists $d(\cdot)$ such that
$\sup_{\beta \in B} |f(x_i, \beta)| \le d(x_i)$ and $E[d(x_i)^2]$ is finite. (iv) For
all $\beta$ in B such that $\beta \ne \beta_0$, $E[\omega(x_i)(f(x_i, \beta_0) - f(x_i, \beta))^2] > 0$,
where $\beta_0$ is the true $\beta$.

Our last assumption imposes a condition on the weight function.

Assumption 5: The distributions of $z_i$ and $x_i$ are absolutely continuous with
densities $g_z(\cdot)$ and $g_x(\cdot)$ respectively such that there is a positive constant C
with $\omega(\cdot)\, g_z(\cdot) \le C\, g_x(\cdot)$.

While estimation of the weight function may sometimes require careful attention,
in the common case where the density of $z_i$ is continuous and nonzero everywhere
and the density of $x_i$ is bounded, any $\omega(x_i)$ which is zero outside some
bounded set will suffice.

Hausman, Newey and Powell (1988) then prove that, given the assumptions
and the additional condition that the density of $z_i$ is bounded away from zero on
an interval, $\hat{\beta}$ determined from equation (3.6) is consistent, plim $\hat{\beta} = \beta_0$, if
K, which is chosen as a function of the sample size, K(n), has the properties that
$K(n) \to \infty$ and $K(n)^2 \ln[K(n)] / \ln(n) \to 0$. Note that the growth rate for K is
somewhat slower than the square root of the natural log of the sample size.
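
To get a feel for how restrictive this rate condition is, the sketch below evaluates $K^2 \ln K / \ln n$ for a few polynomial orders. It reads the condition with K(n) squared, which is what the remark about the square root of the log of the sample size suggests. The helper name `rate_ratio` is hypothetical, and an asymptotic condition of this kind does not by itself determine K in any finite sample.

```python
import math


def rate_ratio(K: int, n: int) -> float:
    """Return K^2 * ln(K) / ln(n), the quantity the rate condition drives to zero."""
    if K < 1 or n < 2:
        raise ValueError("need K >= 1 and n >= 2")
    return K ** 2 * math.log(K) / math.log(n)


if __name__ == "__main__":
    # n = 1324 is the sample size reported for the 1982 CES tables;
    # the larger values of n are purely illustrative.
    for n in (1324, 10 ** 6, 10 ** 12):
        for K in (2, 3, 4):
            print(f"n={n:>14d}  K={K}  ratio={rate_ratio(K, n):.3f}")
```

Because ln(n) grows so slowly, the ratio falls only gradually as n increases, which is one way to see why low polynomial orders are the natural choice at these sample sizes.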

In  this  section  we  have  discussed  a  consistent  estimator  for  the 
general  nonlinear  errors  in  variables  specification.  We  now  apply  this  estimator 
together  with  the  estimators  of  Section  II  to  estimate  Engel  curves  on  micro 
data.  Derivation  of  an  estimator  with  two  or  more  mismeasured  variables  and 
derivation  of  the  asymptotic  distribution  of  the  estimator  of  this  Section  are 
both quite complicated problems which we defer to future research.

IV. Estimation of Some Engel Curves

Estimation of Engel curves has long been an area of interest among
econometricians. Many of the early investigations used British data, and the
detailed cross section information collected in the annual Family Expenditure
Surveys has led to considerable further investigation. Many of these studies
investigate the best specification for the form of the Engel curves; Prais and
Houthakker (1955, 1971) and Leser (1963) are notable examples. The "Leser-Working"
form of Engel curve, in which budget shares are regressed on the log of
income or expenditure, has been widely adopted in recent research. Both the
translog specifications of Engel curves, e.g. Jorgenson, Lau and Stoker (1982),
and the AIDS specification of Deaton and Muellbauer (1980) use this
specification. An alternative specification, the quadratic expenditure system,
which specifies budget shares as a function of both the inverse of expenditure
and its square, has been estimated by Pollak and Wales (1980).

Economic  theory  gives  almost  no  general  guidance  in  specification  of 
Engel  curves.  "Adding-up"  of  budget  shares  to  one  is  the  only  restriction,  and 
this  restriction  is  typically  enforced  in  the  data.  However,  in  a  notable  paper 
Gorman  (1981)  considered  Engel  curves  in  which  either  expenditure  or  budget 
shares  are  specified  as  polynomials  in  functions  of  expenditure,  e.g.  log  of 
expenditure.  Gorman  makes  the  key  assumption,  as  does  most  of  the  previous 
demand  curve  literature,  that  the  polynomial  functions  which  contain  expenditure 
do not depend on price in the demand curve specifications. Given this "exactly
aggregable function", Gorman demonstrates that the rank of the matrix of
coefficients for the polynomial terms in income is at most three. We
investigate Engel curve specifications of the Gorman form and provide tests of
his rank three restriction.

Lewbel (1986, 1987) further considers Gorman's results for additional Engel
curve specifications.

Few  studies  of  Engel  curves  have  used  estimators  other  than  least 
squares  or  nonlinear  least  squares.  The  most  notable  exception  is  Liviatan 
(1961).  Liviatan  noted  that  Friedman's  (1957)  relabelling  of  the  classic  errors 
in  variables  model  into  "permanent"  income  and  "transitory"  income  made  the  use 
of  current  income  as  a  predetermined  variable  inappropriate  in  family  budget 
studies. Liviatan also noted Summers' (1959) objection to the use of current
expenditure as the predetermined variable because of joint
endogeneity. He used instrumental variables with current income used as an
instrument for current expenditure. Liviatan's assumption that current income
is uncorrelated with the stochastic disturbances in an Engel curve specification
seems highly questionable. He justified the assumption based on Friedman's
assertion that "permanent" income and "transitory" consumption are uncorrelated
with each other. However, subsequent econometric research, e.g. Attfield (1978),
has demonstrated that the Friedman assumption is unlikely to hold true in family
budget data. Thus, alternative instrumental variables are necessary. Here we
investigate two alternative sets of instruments: expenditure in other periods or
determinants of income and expenditure such as education and age. Neither of
these alternative sets of instrumental variables should be correlated with the
stochastic disturbance in the Engel curve specifications, although we test the
assumptions subsequently.


Liviatan used IV on a linear Engel curve specification. Leser (1963) applied
a variant of Liviatan's procedure. However, straightforward IV is inapplicable
to all of Leser's Engel curve specifications, except his first two linear
specifications, because of the nonlinearity of his specifications. Inconsistent
estimates will result for reasons discussed in Section I. In particular, Leser's
best fitting specification (1963, p. 699) contains terms in both log expenditure
and the inverse of expenditure, which makes application of regular IV inappropriate.

We first consider a quadratic generalization of the Leser-Working Engel
curve  specification  where  the  budget  share  of  commodity  i  is  a  function  of  both 
log  expenditure  and  the  square  of  log  expenditure: 

(4.1)   $w_i = \beta_0 + \beta_1 \log(z) + \beta_2 \log^2(z) + \epsilon_i$

where $\epsilon_i$ is the stochastic disturbance. However, we do not observe actual
expenditure, but we instead have data on $\log x = \log z + \eta$, where we assume that
the error of observation satisfies the properties of Assumption A.1 (ii).
Alternatively,  a  permanent  income  explanation  can  be  attached  to  equation  (4.1); 
however,  we  are  unwilling  to  make  any  assumption  about  the  relationship  of 
permanent  income  and  transitory  consumption.  Note  that  equation  (4.1)  satisfies 
the  Gorman  rank  3  condition,  while  the  usual  translog-AIDS  specifications  based 
on  the  Leser-Working  specification  have  rank  2  coefficient  matrices. 
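
The expenditure elasticities reported for this specification are functions of the coefficients of equation (4.1). A minimal sketch, assuming the standard definition for a budget share equation: commodity expenditure is $w \cdot z$, so the elasticity with respect to total expenditure is $1 + (\partial w / \partial \log z) / w$. The function name and the coefficient values in the usage note are hypothetical and are not estimates from the paper.

```python
def share_elasticity(beta0: float, beta1: float, beta2: float, log_z: float) -> float:
    """Expenditure elasticity implied by w = b0 + b1*log(z) + b2*log(z)^2.

    Commodity expenditure is w * z, so
        d log(w * z) / d log(z) = 1 + (b1 + 2*b2*log(z)) / w.
    """
    share = beta0 + beta1 * log_z + beta2 * log_z ** 2
    if share <= 0.0:
        raise ValueError("fitted budget share must be positive")
    slope = beta1 + 2.0 * beta2 * log_z  # dw / d log(z)
    return 1.0 + slope / share
```

With hypothetical values `share_elasticity(0.5, -0.05, 0.0, 2.0)` the fitted share is 0.4 and the elasticity is 0.875. Setting `beta2` to zero recovers the Leser-Working case, and a nonzero `beta2` lets the slope of the budget share vary across the expenditure quartiles.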

Our  first  results  are  from  the  1982  Consumer  Expenditure  Survey  (CES). 
The  CES  collects  data  from  families  over  4  quarters  so  that  we  can  apply  the 
repeated  measurement  techniques  discussed  in  Section  II.  The  basic  data  we  use 
are  budget  share  and  total  expenditure  for  each  family  from  1982:1.  We  initially 
use  as  the  repeated  measurement  total  expenditure  from  1982:2.  The  repeated 
observation estimator of equations (2.5)-(2.7) is used. We estimate Engel curves
on  5  commodity  groups:  food,  clothing,  recreation,  health  care,  and 
transportation.  We  report  elasticity  estimates  and  asymptotic  standard  errors  at 
3 quartiles so that the shape of the Engel curves can be compared:

Table 4.1:   Expenditure Elasticity Estimates Using 1982 CES Data
Repeated Measurement Estimator using 1982:2
(Asymptotic Standard Errors)

                      IV Estimates                OLS Estimates
                       Percentile                   Percentile
Commodity         25th    50th    75th         25th    50th    75th

Food               .83     .74     .63          .73     .68     .60
                  (.06)   (.04)   (.05)        (.03)   (.02)   (.03)

Clothing          1.44    1.43    1.41         1.29    1.14     .99
                  (.16)   (.08)   (.13)        (.06)   (.04)   (.04)

Recreation        1.47    1.28    1.12         1.51    1.28    1.08
                  (.18)   (.09)   (.14)        (.09)   (.06)   (.06)

Health            .009     .10     .44          .50     .56     .68
                  (.21)   (.14)   (.21)        (.09)   (.07)   (.09)

Transpor.         1.19    1.11    1.02         1.18    1.35    1.48
                  (.11)   (.05)   (.12)        (.06)   (.04)   (.04)

Number of observations = 1324


Overall, with the exception of health, the IV and OLS estimates are reasonably
close and both estimate the elasticities precisely. A joint test of the
Leser-Working specification, that all the $\beta_2$'s are zero, is computed to be 9.75 for the
IV estimates. The marginal significance level for a $\chi^2$ random variable with 5
degrees of freedom is about .08. Similarly, a test based on the OLS estimates is
computed to be 73.1, which has a marginal significance level of less than .001.
Thus, both the estimates, especially for food and recreation, and the statistical
tests give some indication that a quadratic term gives better estimates of Engel
curves on individual data. The usual assumption of constant budget share
elasticities which the Leser-Working specification imposes appears inconsistent
with the 1982 CES data.
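
The marginal significance levels quoted for these test statistics are upper-tail probabilities of a chi-square distribution. The sketch below computes them with the Python standard library only, using the recurrence for the regularized upper incomplete gamma function with integer degrees of freedom. The function name `chi2_sf` is an illustrative choice and not something taken from the paper.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x < 0 or df < 1:
        raise ValueError("need x >= 0 and df >= 1")
    y = x / 2.0
    if df % 2 == 0:
        # Q(1, y) = exp(-y) starts the recurrence for even df.
        a, q = 1.0, math.exp(-y)
    else:
        # Q(1/2, y) = erfc(sqrt(y)) starts the recurrence for odd df.
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q
```

For the two statistics quoted above, `chi2_sf(9.75, 5)` is about .08 and `chi2_sf(73.1, 5)` is far below .001, in line with the marginal significance levels reported in the text.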

While the IV and OLS estimates are reasonably close, some sizeable
differences do occur. For instance, the estimated food elasticity at the 25th
percentile and the clothing elasticities at the 50th and 75th percentile are
quite different, with both sets of elasticities estimated with a high degree of
precision. Also, the estimated transportation elasticities differ by a range of
25% to 50% at the 50th and 75th percentile between OLS and IV. To test for a
possible difference we do a Hausman (1978) type specification test of the IV
estimates versus the OLS estimates. The estimated statistic is 87.39, which is
distributed as a $\chi^2$ random variable with 15 degrees of freedom. Thus, we find
strong evidence that use of current expenditure in estimation of Engel curves on
micro data leads to errors in variables problems. An alternative method to
consider the problem is to note that the estimated var($\eta$) is .108 while the
estimated var(z) is .150. Thus, the measurement error in current expenditure is
about 42% of the total variance of .258 of the logarithm of measured expenditure.
The substantial proportion of measurement error in measured expenditure can lead
to significant problems in OLS-type estimators.
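
The Hausman-type comparison used here is a joint test on 15 coefficients with their full covariance matrices, which are not reported. As a rough illustration of the logic only, the sketch below computes the one-parameter version of the statistic, assuming that OLS is efficient under the null hypothesis of no measurement error so that the variance of the difference is the IV variance minus the OLS variance. The function name is hypothetical, and the inputs in the usage note are the food elasticities at the 25th percentile from Table 4.1, treated as the parameter of interest.

```python
def hausman_scalar(b_iv: float, se_iv: float, b_ols: float, se_ols: float) -> float:
    """One-parameter Hausman statistic, chi-square with 1 df under the null.

    Assumes OLS is efficient under the null, so
        Var(b_iv - b_ols) = Var(b_iv) - Var(b_ols).
    """
    var_diff = se_iv ** 2 - se_ols ** 2
    if var_diff <= 0.0:
        raise ValueError("IV variance must exceed OLS variance for this form of the test")
    return (b_iv - b_ols) ** 2 / var_diff
```

`hausman_scalar(0.83, 0.06, 0.73, 0.03)` gives roughly 3.7, just below the 5 percent critical value of 3.84 for one degree of freedom. A single elasticity taken alone is therefore only borderline, while the joint statistic reported in the text pools the evidence across all commodities and coefficients.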

We now explore the Gorman results. Gorman's theorem implies that
higher order polynomial terms in log income will have a linear relationship to
the lower order terms since the matrix of coefficients is at most of rank three.
For the polynomial generalization of equation (4.1), the rank restriction takes
the form that the ratio of coefficients of the cubic terms to the coefficients of
the quadratic terms will be constant across equations. First, we estimate a
generalization of equation (4.1) with a third degree term in log income included:

(4.2)   $w_i = \beta_0 + \beta_1 \log(z) + \beta_2 \log^2(z) + \beta_3 \log^3(z) + \epsilon_i.$

This rank restriction result follows from Gorman (1981), p. 16, equation (1).

The estimated Engel curves and elasticities are quite similar to those based on
the quadratic specification of equation (4.1). The $\chi^2$ statistic that the third
order  terms  are  all  zero  for  the  IV  estimator  is  2.59  with  five  degrees  of 
freedom;  the  corresponding  statistic  for  the  OLS  estimates  is  10.57.  Thus,  the 
IV  estimates  do  not  demonstrate  much  evidence  for  more  than  a  quadratic  term  in 
the  budget  share  specification.  The  OLS  estimates,  with  a  marginal  significance 
level  of  about  .07,  are  more  ambiguous  about  the  cubic  terms.  However,  we  will 
use  the  quadratic  specification  in  our  subsequent  estimation  because  we  believe 
that  the  IV  estimates  are  likely  to  be  better  than  the  OLS  estimates. 

We then estimate the "Gorman statistic" to see whether the coefficients
in the cubic specification of equation (4.2) have rank three. We find a rather
remarkable result (which we hope is not due to computational error). Despite
considerable variation in the estimates of the $\beta_2$'s and the $\beta_3$'s, we find their
ratios to be remarkably close in actual values and estimated precisely, although
we cannot estimate the individual coefficients very precisely. Thus, we find a
more special result than Gorman's result: not only is the coefficient matrix of
rank three, but the linear dependency takes on a remarkably simple form.

Table 4.2:   Estimated Ratios of $\beta_3/\beta_2$ for Equation (4.2)

Commodity              IV Ratio     OLS Ratio

Food                    -24.98       -129.35
                        (0.47)       (10739)

Clothing                -25.24        -22.6
                        (0.56)        (2.46)

Recreation              -25.12        -22.97
                        (0.32)        (2.17)

Health Care             -23.31        -28.57
                        (5.18)        (2.38)

Transportation          -25.57        -34.09
                        (0.31)        (9.05)

$\chi^2(4)$ Statistic      1.60          7.85

Thus, both the IV results and the OLS results demonstrate that the Gorman rank 3
result holds in the 1982 CES data. The one anomalous result for OLS is for
food, where the estimated quadratic term is very near zero. This estimate leads
to the high estimated Gorman ratio as well as the very high estimated standard
error of the ratio. Note that the OLS results are not as good as the IV results
since the test statistic has a marginal significance level of about .10.
However, as before, we tend to prefer the IV estimates. Perhaps the results are
"too good" given the variation in prices faced by families in the sample, on which we
have no data.
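
The equality of the five ratios imposes four restrictions, which is where the four degrees of freedom come from. A minimal sketch of such a test, under the simplifying assumption (not made in the paper) that the ratio estimates are independent across commodities: the common ratio is the precision-weighted mean, and the statistic is the weighted sum of squared deviations from it. The function name is hypothetical, and the inputs in the usage note are the IV ratios and standard errors from Table 4.2.

```python
def ratio_equality_stat(ratios, std_errs):
    """Minimum chi-square test that all ratios share one common value.

    Treats the estimates as independent, so the weights are 1 / se^2.
    Returns (common_ratio, statistic); the statistic has len(ratios) - 1
    degrees of freedom under the null of equality.
    """
    if len(ratios) != len(std_errs) or len(ratios) < 2:
        raise ValueError("need matching lists with at least two entries")
    weights = [1.0 / se ** 2 for se in std_errs]
    common = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    stat = sum(w * (r - common) ** 2 for w, r in zip(weights, ratios))
    return common, stat


if __name__ == "__main__":
    iv_ratios = [-24.98, -25.24, -25.12, -23.31, -25.57]
    iv_ses = [0.47, 0.56, 0.32, 5.18, 0.31]
    print(ratio_equality_stat(iv_ratios, iv_ses))
```

With the IV column this gives a common ratio near -25.3 and a statistic of about 1.7, of the same order as the reported 1.60. Any gap may reflect the cross-equation covariances that this independence assumption ignores.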

We  now  reestimate  equations  (4.1)  and  (4.2)  by  IV  using  expenditure 
from  1982:3  in  place  of  expenditure  from  1982:2  to  form  the  instrument.  The 
results  are  very  similar  to  the  results  in  Table  4.1: 


Table 4.3:   Expenditure Elasticity Estimates Using 1982 CES Data
Repeated Measurement Estimator using 1982:3
IV Estimates

                       Percentile               Gorman
Commodity         25th    50th    75th         Statistic

Food               .72     .69     .65          -25.18
                  (.06)   (.04)   (.06)          (.38)

Clothing          1.50    1.44    1.38          -25.18
                  (.12)   (.07)   (.13)         (2.23)

Recreation        1.41    1.47    1.50          -24.65
                  (.17)   (.11)   (.20)         (1.01)

Health             .10     .23     .60          -28.34
                  (.18)   (.13)   (.23)         (7.37)

Transpor.         1.23    1.09     .94          -26.60
                  (.09)   (.05)   (.10)         (7.10)

$\chi^2(4)$ Statistic   0.44      Number of Obs = 1324

The $\chi^2$ statistic for the Gorman ratios again shows no evidence of rejecting the
rank 3 restriction. A Hausman (1978) type specification test statistic for IV
versus OLS is estimated to be 135.71, which is distributed as $\chi^2$ with 15 degrees
of freedom. Again, a strong indication is found of the importance of measurement
error in the micro data and potential problems with the use of OLS-type
estimators. The estimated variance of the measurement error is .106, which is
extremely close to the previous estimate and which represents 41% of the total
variance of the logarithm of measured expenditure. Thus, both sets of repeated
measurement estimates yield numerically consistent sets of coefficient estimates.

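
The variance figures quoted for the two repeated measurement instruments translate directly into a share of measurement error in measured log expenditure. The sketch below uses the reported error variances (.108 and .106) and the reported total variance of .258. The attenuation factor it returns is the textbook result for a linear regression on a single mismeasured regressor, included only as a rough gauge and not as a description of the bias in the quadratic Engel curves. Both function names are hypothetical.

```python
def error_share(var_error: float, var_total: float) -> float:
    """Fraction of the variance of the measured regressor due to measurement error."""
    if not 0.0 <= var_error <= var_total or var_total <= 0.0:
        raise ValueError("need 0 <= var_error <= var_total and var_total > 0")
    return var_error / var_total


def linear_attenuation(var_error: float, var_total: float) -> float:
    """Factor multiplying the true slope in a bivariate linear errors-in-variables model."""
    return 1.0 - error_share(var_error, var_total)


if __name__ == "__main__":
    for label, var_error in (("1982:2", 0.108), ("1982:3", 0.106)):
        share = error_share(var_error, 0.258)
        print(f"{label}: error share {share:.3f}, attenuation {1.0 - share:.3f}")
```

The two shares round to 42% and 41%, matching the figures in the text.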
Up to this point, we have used the replicated measurement estimator for
our Engel curve specifications. Here we use the predicted IV estimator of
Section II, equations (2.13) and (2.15)-(2.16), where the instruments
used include age, education, race, union membership, spouse age and employment,
and region and industry dummy variables. Thus, we use a "predicted value" for
expenditure to form the instruments to use in the nonlinear specifications, where
the $R^2$ of the prediction equation is about 0.3.

Table 4.4:   Expenditure Elasticity Estimates Using 1982 CES Data
Using Predicted Expenditure Estimator
IV Estimates

                       Percentile               Gorman       Overid
Commodity         25th    50th    75th         Statistic    Statistic

Food               .69     .62     .51          -25.39        3.12
                  (.15)   (.06)   (.15)          (.15)

Clothing          1.71    1.47    1.28          -25.22        1.35
                  (.35)   (.01)   (.25)          (.21)

Recreation        2.35    1.51     .93          -25.94        0.59
                  (.45)   (.11)   (.31)          (.66)

Health             .15     .24     .50          -25.26        4.68
                  (.31)   (.15)   (.34)          (.15)

Transpor.         1.59    1.02     .39          -26.61        1.46
                  (.23)   (.08)   (.25)         (1.84)

$\chi^2(4)$ Statistic   2.43      Number of obs = 1324

Except for recreation and for transportation at the 75th percentile, the estimated
elasticities are quite close to the repeated measurement estimates. A test that
all the quadratic terms are zero is estimated to be 11.78, which has a marginal
significance level of about .04. Thus, again we find evidence that higher order
terms should be included in the Engel curve specification. A Hausman
specification test statistic is calculated to be 73.34, which again indicates that
the IV estimates are better than the OLS estimates. The $\chi^2(2)$ test for correct
specification from equation (2.21) is well below its critical value of 6.0 at the
5 percent level for each commodity. The test for overidentification does not
reject our specification.

We now do a $\chi^2$ test that the two sets of repeated measurement IV
estimates are the same. This test is equivalent to a test of the overidentifying
restrictions on the instruments. The $\chi^2$ statistic is estimated to be 12.4, and
since it has 15 degrees of freedom, no evidence is found to reject the hypothesis
of orthogonality of the instruments. However, the equivalent tests for the
predicted expenditure form of the IV estimator in relation to the repeated
measurement IV estimators yield $\chi^2$ statistics of 35.6 and 91.6, respectively,
both  of  which  indicate  that  either  the  repeated  measurement  or  the  predicted 
value  instruments  are  not  mutually  orthogonal  to  the  stochastic  disturbance  in 
the  Engel  curve  specifications.  Since  the  repeated  measurement  estimators  are 
mutually  consistent  with  each  other,  we  tend  to  believe  that  they  are  the 
superior  estimators  in  the  current  situation.  We  cannot  be  more  specific  about 
the  relative  superiority  of  the  estimators  without  further  research. 

We repeat the IV estimation of equations (4.1) and (4.2) using 1972 CES
data, where we predict expenditure using similar instruments. Repeated
observations on family expenditure are not available for 1972. The 1972 CES data
set is sufficiently large that we estimated the Engel curve only on 4 person
families to minimize potential family size effects on the estimates. The
estimated elasticities are reported in Table 4.5:


Table 4.5:   Expenditure Elasticity Estimates Using 1972 CES Data
Predicted Expenditure Estimator
IV Estimates

                       Percentile             Gorman Statistic      Overid
Commodity         25th    50th    75th         IV        OLS       Statistic

Food               .76     .67     .54        -42.1     -44.5        1.23
                  (.12)   (.07)   (.13)       (.48)    (4.73)

Clothing          1.43    1.36    1.22        -41.8     -42.8        3.82
                  (.89)   (.83)   (.80)      (1.14)     (.66)

Recreation        1.32    1.41    1.49        -41.3     -43.5        2.39
                  (.13)   (.10)   (.17)      (2.23)    (1.65)

Health            1.07     .78     .44        -41.1     -42.0        0.41
                  (.20)   (.11)   (.21)      (1.67)     (.50)

Transpor.          .54     .59     .65        -42.2     -40.6        2.31
                  (.18)   (.10)   (.19)       (.22)    (2.60)

$\chi^2(4)$ Statistic   0.54      Number of obs = 992

The estimated elasticity for transportation is below the 1982 estimates, which may
well arise from the extremely large rise in gasoline prices between 1972 and
1982. The estimated IV Gorman ratios are again very close, and a $\chi^2$ test fails
to come close to a rejection of equality. Note that the estimated values of
the Gorman ratios differ from their estimated values in 1982. This result is to
be expected since the slope coefficients are, in general, nonlinear functions of
prices. The CPI increased by over 130% between 1972 and 1982 with significant
differences in increases across expenditure categories. Thus, the ratios of
nonlinear functions of prices would be expected to change as prices change. A
Hausman (1978) type specification test of IV versus OLS is estimated to be 117.8.
Thus, we again find strong evidence of the importance of measurement error in the
1972 CES data as we did in the 1982 data. Lastly, the $\chi^2(2)$ test of
overidentification of equation (2.21) once more does not reject our specification
of the Engel curve for any commodity.

The 1972 CES data were kindly provided to us by Professor Dale Jorgenson.

Note that the estimated OLS Gorman ratios are also very close. The $\chi^2(4)$
statistic for the OLS estimates is 1.87, which indicates no grounds for rejection.
For completeness, the $\chi^2(5)$ statistic that all the quadratic terms are zero is
17.2, which is strong evidence against the Leser-Working Engel curve specification
on micro data.

One  potential  problem  that  we  have  not  yet  accounted  for  is  errors  in 
variables  in  the  left  hand  side  variable,  the  budget  shares,  in  equations  (4.1) 
and  (4.2).  To  the  extent  that  the  errors  in  variables  occurs  in  expenditure  on  a 
given  commodity,  which  forms  the  numerator  of  the  budget  share,  no  special 
problem  arises.  However,  to  the  extent  that  the  denominator  of  the  budget  share, 
total  expenditure,  is  measured  with  error,  estimation  problems  arise.  Since  the 
measurement  error  enters  the  problem  in  a  non-polynomial  variable,  no  obvious 
solution  exists.  But  the  problem  can  be  eliminated  by  respecifying  equations 
(4.1)  and  (4.2)  with  commodity  expenditure  as  the  left  hand  side  variable  instead 
of  the  budget  share.  The  estimation  procedure  remains  the  same  except  for  an 
adjustment  to  the  estimated  standard  errors  of  the  coefficients  to  account  for 
heteroscedasticity.

The results are presented in Table 4.6. We do not find that errors in
variables in the left hand side variable present a significant problem, although
this Engel curve specification does not fit the data as well as the earlier
specifications.

Table 4.6:   Expenditure Elasticity Estimates Using 1982 CES Data
Commodity Expenditure is Left Hand Side Variable
IV Estimates

                       Percentile               Gorman
Commodity         25th    50th    75th         Statistic

Food               .89     .73     .62          -33.38
                  (.08)   (.04)   (.07)        (33.18)

Clothing          1.72    1.61    1.36          -15.26
                  (.23)   (.11)   (.16)         (1.49)

Recreation        1.55    1.26    1.05          -14.59
                  (.17)   (.11)   (.20)        (27.04)

Health            -.41     .15     .68          -16.61
                  (.39)   (.24)   (.32)         (1.13)

Transpor.         1.06    1.22    1.16          -17.69
                  (.33)   (.13)   (.30)         (0.82)

$\chi^2(4)$ Statistic   2.29      Number of Obs = 1324


The  elasticity  estimates  are  quite  close  to  the  elasticity  estimates  derived  from 
the  budget  share  results  of  Tables  4.1  and  4.3.  The  only  exception  is  the 
estimated  health  elasticity  at  the  25th  percentile  which  is  estimated  very 
imprecisely.  The  Gorman  ratios  are  all  quite  close  to  one  another  with  the 
exception of food, which again is measured quite imprecisely. The $\chi^2$ statistic
does not come close to a rejection of the Gorman restriction. Thus, when we
estimate  the  Engel  curves  in  commodity  expenditure  form,  rather  than  budget  share 
form,  the  results  remain  essentially  unchanged.  We  again  find  support  for  the 
Gorman  restriction  on  the  specification  of  the  Engel  curve. 

Up  to  this  point  we  have  considered  only  polynomial  Engel  curves  for 
budget share data. However, Leser (1963) found evidence which indicated that the
following Engel curve specification was superior to the Leser-Working
specification:


(4.3)   $w_i = \beta_0 + \beta_1 \log(z) + \beta_2 / z + \epsilon_i.$

Thus, he generalized the Working specification to include the inverse of income
as well as its logarithm. We consider this extended Leser specification as well
as another generalization of the Leser-Working specification:

(4.4)   $w_i = \beta_0 + \beta_1 \log(z) + \beta_2\, z \log(z) + \epsilon_i.$

Note that both equations (4.3) and (4.4) are rank two specifications. The
coefficients  of  these  generalized  Engel  curve  specifications  are  estimated  using 
the  general  nonlinear  errors  in  variables  estimator  of  Section  III,  equation 
(3.6).  Recall  that  the  estimation  strategy  of  Section  III  involves  fitting  the 
Engel  curves  with  polynomials  followed  by  estimation  of  the  coefficients  of  the 
nonlinear  specifications  using  the  predicted  values  of  the  budget  shares  from  the 
polynomial  coefficient  estimates.  The  estimated  coefficients  of  equations  (4.3) 
and  (4.4)  follow  from  the  best  fitting  polynomial.  In  9  out  of  10  cases  the  best 
fitting  polynomial  is  a  second  degree  polynomial  with  the  sole  exception  being 
health  care  for  the  specification  of  equation  (4.4)  which  uses  a  third  degree 
polynomial.
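
The two-step strategy recalled above can be sketched for the extended Leser form of equation (4.3). Because that specification is linear in its coefficients, the minimum distance problem of equation (3.6) reduces to a weighted least squares fit of the polynomial fitted values on the regressors 1, log(x) and 1/x. The sketch assumes the first-step polynomial is in u = log(x), uses a weight equal to one on a bounded interval and zero outside it, and relies only on the standard library. All function names are hypothetical, and the first-step fitted values are taken as given.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - tail) / m[i][i]
    return beta


def leser_basis(u):
    """Regressors of equation (4.3) written in u = log(x): 1, log(x), 1/x."""
    return [1.0, u, math.exp(-u)]


def box_weight(lo, hi):
    """Weight equal to one on [lo, hi] and zero outside this bounded set."""
    return lambda u: 1.0 if lo <= u <= hi else 0.0


def minimum_distance(us, fitted, weight, basis=leser_basis):
    """Minimize sum_i weight(u_i) * (fitted_i - basis(u_i) . beta)^2 over beta."""
    k = len(basis(us[0]))
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for u, f in zip(us, fitted):
        w, row = weight(u), basis(u)
        for i in range(k):
            xty[i] += w * row[i] * f
            for j in range(k):
                xtx[i][j] += w * row[i] * row[j]
    return solve_linear(xtx, xty)
```

In use, `us` would hold the observed log expenditures and `fitted` the budget shares predicted by the first-step polynomial at those points. A specification that is nonlinear in its coefficients would need a numerical minimizer in place of the normal equations.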

The estimates of the nonlinear Engel specifications are given in Table
4.7.

Table 4.7:   Estimates Using 1982 CES Data: General Nonlinear Specifications

                     Equation (4.3)               Equation (4.4)
                       Percentile                   Percentile
Commodity         25th    50th    75th         25th    50th    75th

Food               .82     .71     .59          .82     .76     .67
                 (.060)  (.031)  (.068)       (.057)  (.043)  (.045)

Clothing          1.45    1.45    1.42         1.45    1.41    1.39
                  (.13)   (.11)   (.15)        (.13)  (.076)  (.084)

Recreation        1.43    1.23    1.09         1.45    1.33    1.20
                  (.15)   (.11)   (.17)        (.17)   (.10)   (.10)

Health             .06     .28     .60          .12     .04     .15
                  (.19)   (.13)   (.27)        (.15)   (.17)   (.22)

Transpor.         1.16    1.08    1.01         1.17    1.12    1.07
                  (.08)   (.07)   (.14)        (.09)   (.06)   (.08)

Note  that  the  estimates  of  the  elasticities  are  quite  similar  between  the  two 
nonlinear  specifications.  Furthermore,  the  estimated  elasticities  are  close  to 
the  estimated  elasticities  for  the  polynomial  specifications  in  Table  4.1.  We 
compare  the  closeness  of  fit  of  the  nonlinear  specifications  to  the  predicted 
values  of  the  underlying  polynomials  since  that  is  the  criterion  used  to  estimate 
the  coefficients  in  equation  (3.6).  For  4  of  the  5  commodities,  the  extended 
Leser  specification  of  equation  (4.3)  fits  better  than  the  generalized 
specification  of  equation  (4.4).  The  exception  is  health  care  where  none  of  the 
Engel  curve  specifications  do  very  well.  However,  to  the  extent  that  the 
estimated  elasticities  are  so  similar,  the  choice  of  the  "best"  specification  is 
probably  a  rather  unimportant  exercise. 

Standard errors in Table 4.7 are calculated by the bootstrap method.

We now consider an additional nonlinear specification which accounts
for possible measurement errors in the left hand side variables, the budget
shares. We take the quadratic generalization of the Leser-Working Engel curve of
equation (4.1) and multiply both sides of the equation by total expenditure:

(4.5)   $e_i = \beta_0 z + \beta_1 z \log(z) + \beta_2 z \log^2(z) + \epsilon_i.$


In  equation  (4.5)  commodity  expenditure  is  now  the  left  hand  side  variable  which 
eliminates  possible  problems  from  measurement  error  in  the  denominator  of  the 
budget  shares  in  equations  (4.1)  and  (4.2).  Our  estimates  of  equation  (4.5)  are 
very  similar  to  the  estimates  in  Table  4.7  and  earlier  tables.  For  instance,  the 
estimated  elasticities  at  the  50th  percentile  are  (0.69,  1.46,  1.17,  0.37,  1.13). 
Thus,  the  nonlinear  specification  of  the  quadratic  version  of  the  Leser-Working 
Engel  curve  yields  estimates  very  close  to  our  previous  estimates  so  that 
measurement  error  in  the  left  hand  side  variable  again  does  not  seem  to  be  an 
important  problem. 
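The link between the share form and the expenditure form can be checked with a small numerical sketch. If the budget share is w(z) = b0 + b1 log(z) + b2 log^2(z), then commodity expenditure is z w(z) and the expenditure elasticity is 1 + (b1 + 2 b2 log(z)) / w(z). The coefficient values below are hypothetical and chosen only for illustration; they are not estimates from the CES data, and the function names are ours.

```python
import math

# Hypothetical coefficients for illustration only (NOT estimates from the paper).
B0, B1, B2 = 0.9, -0.12, 0.004


def share(z):
    """Quadratic Leser-Working budget share at total expenditure z."""
    lz = math.log(z)
    return B0 + B1 * lz + B2 * lz ** 2


def expenditure(z):
    """Commodity expenditure: the share equation multiplied through by z."""
    return z * share(z)


def elasticity(z):
    """Analytic expenditure elasticity, d log(expenditure) / d log(z)."""
    lz = math.log(z)
    return 1.0 + (B1 + 2.0 * B2 * lz) / share(z)


def numerical_elasticity(z, h=1e-6):
    """Central difference of log expenditure with respect to log z."""
    up = math.log(expenditure(z * math.exp(h)))
    down = math.log(expenditure(z * math.exp(-h)))
    return (up - down) / (2.0 * h)


for z in (5000.0, 20000.0, 80000.0):
    print(z, round(elasticity(z), 4), round(numerical_elasticity(z), 4))
```

The analytic and numerical elasticities agree, which illustrates that the two forms imply the same elasticity at any given expenditure level; only the left hand side variable, and hence the error structure, differs.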

Our  final  exploration  of  the  Engel  curve  specification  involves  the 
addition  of  demographic  variables  in  equation  (4.1).  Differences  in  household 
size  have  often  been  a  focus  of  attention  in  the  specification  of  Engel  curves; 
here  we  also  include  region  of  the  U.S.  to  account  for  regional  price  differences 
of  the  commodities  as  well  as  age  of  the  household  head  to  account  for  life  cycle 
effects.  The  demographic  variables  are  all  entered  as  indicator  (dummy) 
variables  with  4  family  size  groups,  4  region  groups,  and  4  age  groups.  We 
believe that this specification is preferable to the non-identified approach of
family equivalence scale specifications. The approach of equation (2.23) is used
to include the additional right hand side variables in the Engel curve
specifications.
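As a minimal sketch of the indicator coding, four groups are represented by three 0/1 variables, with the fourth group serving as the omitted base category. The age bands below follow the row labels of Table 4.9; treating all other ages as the base group is an assumption made for this illustration, and the names are hypothetical.

```python
# Age bands follow the row labels of Table 4.9. Treating every other age as
# the omitted base group is an assumption made for this illustration.
AGE_BANDS = [("Age1", 19, 29), ("Age2", 30, 39), ("Age3", 40, 49)]


def age_indicators(age):
    """Return the three 0/1 age indicators for a household head of a given age."""
    return {name: int(low <= age <= high) for name, low, high in AGE_BANDS}


print(age_indicators(34))  # a head in the 30-39 band
print(age_indicators(62))  # a head in the omitted base group
```

Region and family size would be coded in the same way, each with one omitted group, so that the indicators are not collinear with the constant term.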

We now reestimate Tables 4.1 and 4.2, including the demographic
variables and using the repeated measurement estimator.


Table 4.8:  Expenditure Elasticity Estimates Using 1982 CES Data
            Repeated Measurement Estimator using 1982:2
            IV Estimates

                         Percentile
Commodity        25th      50th      75th      Gorman Statistic
Food              .72       .66       .59           -9.93
                 (.05)     (.06)     (.07)          (6.17)
Clothing         1.47      1.43      1.39          -10.44
                 (.14)     (.13)     (.12)          (8.23)
Recreation       1.17      1.16      1.15          -14.80
                 (.17)     (.17)     (.17)           (.43)
Health            .21      -.09      -.33          -14.01
                 (.17)     (.23)     (.39)           (.52)
Transpor.        1.07      1.07      1.06          -15.34
                 (.10)     (.10)     (.11)           (.93)

$\chi^2(4)$ Statistic   1.90      Number of Obs = 1321

3 Family Size, 3 Age, and 3 Region variables are included

The  estimated  quartile  elasticities  change  very  little  from  the  specification 
which  omits  demographics,  with  the  sole  exception  of  the  health  equation.  The 
health  equation  elasticities  are  estimated  very  imprecisely  with  the  negative 
coefficient  estimates  accompanied  by  quite  large  asymptotic  standard  error 
estimates.  The  Gorman  ratios  are  not  as  close  as  in  Table  4.2  although  the  test 
statistic  takes  on  almost  the  same  value  which  indicates  no  reason  to  reject  the 
Gorman  restrictions.  The  food  and  clothing  ratios  are  smaller  in  magnitude  than 
the other three commodities. However, the Gorman ratios for food and clothing are
estimated very imprecisely. We again find strong evidence against the
Leser-Working specification of equation (4.1) and evidence in favor of the cubic
specification of equation (4.2). The $\chi^2(5)$ statistic is estimated to be 48.2.
The Hausman (1978) test of IV versus OLS with demographics is 473.0 which
indicates strong evidence of measurement error since it is distributed as a
$\chi^2(10)$ random variable under the null hypothesis.
Despite the closeness of the quartile elasticity estimates with and
without demographic variables included, we do find a quite significant influence
of demographic variables on expenditure shares. We present the estimation
results in Table 4.9:

Table 4.9:  Coefficient Estimates for Demographic Variables

Commodity        Food     Clothing   Recreation     HC      Transportation

Age1 (19-29)     -.04       -.01        -.00       -.02          .01
                 (.01)      (.01)       (.01)      (.01)        (.02)
Age2 (30-39)      .04        .01         .01        .07          .03
                 (.02)      (.01)       (.01)      (.02)        (.03)
Age3 (40-49)     -.00        .00         .00        .01          .01
                 (.01)      (.00)       (.01)      (.01)        (.01)
Reg1 (NE)         .01        .00         .01       -.00         -.00
                 (.01)      (.00)       (.00)      (.01)        (.01)
Reg2 (W)          .01        .01        -.01       -.08         -.01
                 (.03)      (.01)       (.02)      (.03)        (.04)
Reg3 (MW)        -.01        .01         .02        .01          .01
                 (.03)      (.01)       (.02)      (.03)        (.04)
Size1 (2)        -.00       -.00        -.01        .01          .01
                 (.01)      (.00)       (.00)      (.01)        (.01)
Size2 (3)         .01       -.01        -.01        .02          .01
                 (.01)      (.00)       (.01)      (.01)        (.01)
Size3 (4)         .01       -.00         .00        .00          .00
                 (.00)      (.00)       (.00)      (.00)        (.01)

Mean Share        .22        .06         .05        .06          .21

We find fairly sizeable age effects in the estimated share equations. The region
effects are not particularly large, which helps support the necessary assumption
of constant prices across the US in the Engel curve specification. The notable
exception is the Western region for health care. The health care equation is
difficult to estimate overall; the estimated effect here may arise from the much
larger share of health maintenance organizations in the West in 1982. The family
size effects are statistically significant although they have only a small effect
compared to the shares in expenditure of the five commodities.

We  then  reestimated  the  Engel  curve  specifications  for  1982  using  the 
second repeated measurement. The estimated elasticities, not reported here, are
quite similar to Table 4.7 and the demographic effects are quite similar to Table
4.8. The 5 Gorman ratios are estimated to be (-8.26, -8.10, -9.51, -7.68,
-11.86). Thus, the ratios are once again quite close to each other with a
$\chi^2(4)$ test statistic of 3.97. The test for the quadratic against the cubic
specification is 86.4 which once again gives strong evidence against the
Leser-Working specification. The Hausman test statistic of IV versus OLS is
estimated to be 675.8. However, when we include the demographic variables the two
sets of replicated measurement results are no longer nearly so mutually
consistent as before. The $\chi^2(10)$ statistic for the test of
overidentification is estimated to be 50.4 which easily rejects the
overidentifying restrictions.
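To put the reported test statistics on a common scale, the following sketch computes upper-tail probabilities for chi-square statistics with an even number of degrees of freedom, using the closed form exp(-x/2) * sum_{j<k} (x/2)^j / j! for 2k degrees of freedom. It is applied to the Gorman statistic of 3.97 (assuming 4 degrees of freedom, as in Table 4.8) and to the overidentification statistic of 50.4 with 10 degrees of freedom. The function name is ours, and the closed form does not cover odd degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)**j / j!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    total = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * total


# Statistics quoted in the text; the 4 degrees of freedom for the Gorman
# test are an assumption taken from the chi-square(4) entry in Table 4.8.
p_gorman = chi2_sf_even_df(3.97, 4)
p_overid = chi2_sf_even_df(50.4, 10)

print("Gorman restrictions p-value:", p_gorman)
print("Overidentification p-value:", p_overid)
```

Under these assumptions the Gorman restrictions are not rejected at conventional levels, while the overidentifying restrictions are rejected decisively, in line with the conclusions stated in the text.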

In  this  section  we  have  estimated  various  Engel  curve  specifications 
where  we  have  taken  account  of  errors  in  measurement  in  income  or  expenditure. 
We  find  very  strong  evidence  that  substantial  measurement  error  exists  in  the  CES 
data.  We  also  find  strong  evidence  that  equation  (4.2)  is  preferable  to  the 
Leser-Working  specification  of  equation  (4.1).  However,  we  do  not  find  support 
for  the  hypothesis  that  more  general  nonlinear  specifications  or  higher  order 
polynomial  terms  than  cubic  are  needed.  We  find  strong  support  for  the  Gorman 
rank  condition  which  limits  polynomial  Engel  curve  specifications  to  rank  3.  Our 
results also show that demographic variables have significant effects in Engel
curve specifications of the share. Lastly, we have demonstrated the feasibility
and importance of using consistent instrumental variable estimators in nonlinear
econometric model specifications.